Minimal Suffix and Rotation of a Substring in Optimal Time

Author: Kociumaka Tomasz
Publication date: 01/01/2016

For a text given in advance, the substring minimal suffix queries ask to determine the lexicographically minimal non-empty suffix of a substring specified by the location of its occurrence in the text. We develop a data structure answering such queries optimally: in constant time after linear-time preprocessing. This improves upon the results of Babenko et al. (CPM 2014), whose trade-off solution is characterized by a $\Theta(n\log n)$ product of these time complexities. Next, we extend our queries to support concatenations of $O(1)$ substrings, for which the construction and query time is preserved. We apply these generalized queries to compute lexicographically minimal and maximal rotations of a given substring in constant time after linear-time preprocessing. Our data structures mainly rely on properties of Lyndon words and Lyndon factorizations. We combine them with further algorithmic and combinatorial tools, such as fusion trees and the notion of order isomorphism of strings.
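To make the query concrete, here is a minimal sketch of a slow baseline, not the constant-time data structure from the abstract. It relies on the standard fact that the minimal non-empty suffix of a string is the last factor of its Lyndon factorization, and computes that factorization with Duval's algorithm in time linear in the substring length. The function names and the half-open range convention `[i, j)` are assumptions made for illustration.

```python
def lyndon_factorization(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    n = len(s)
    i = 0
    factors = []
    while i < n:
        j, k = i + 1, i
        # Extend the current run of repetitions of a Lyndon word.
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i          # the candidate Lyndon word grows
            else:
                k += 1         # still repeating the same word
            j += 1
        # Emit every full repetition of the word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def minimal_suffix(text, i, j):
    """Minimal non-empty suffix of text[i:j], recomputed from scratch per query."""
    return lyndon_factorization(text[i:j])[-1]


print(lyndon_factorization("banana"))
print(minimal_suffix("banana", 0, 5))
```

Each query here costs time proportional to the substring length, which is exactly the cost the preprocessing described above is meant to avoid.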

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Constant Factor Approximation for Capacitated k-Center with Outliers

Authors: Cygan Marek, Kociumaka Tomasz
Publication date: 01/01/2014

The $k$-center problem is a classic facility location problem, where given an edge-weighted graph $G = (V,E)$, one is to find a subset of $k$ vertices $S$, such that each vertex in $V$ is "close" to some vertex in $S$. The approximation status of this basic problem is well understood, as a simple 2-approximation algorithm is known to be tight. Consequently, different extensions were studied. In the capacitated version of the problem each vertex is assigned a capacity, which is a strict upper bound on the number of clients a facility can serve, when located at this vertex. A constant factor approximation for the capacitated $k$-center was obtained last year by Cygan, Hajiaghayi and Khuller [FOCS'12], which was recently improved to a 9-approximation by An, Bhaskara and Svensson [arXiv'13]. In a different generalization of the problem some clients (denoted as outliers) may be disregarded. Here we are additionally given an integer $p$ and the goal is to serve exactly $p$ clients, which the algorithm is free to choose. In 2001 Charikar et al. [SODA'01] presented a 3-approximation for the $k$-center problem with outliers. In this paper we consider a common generalization of the two extensions previously studied separately, i.e. we work with the capacitated $k$-center with outliers. We present the first constant factor approximation algorithm with approximation ratio of 25 even for the case of non-uniform hard capacities.

Comment: 15 pages, 3 figures, accepted to STACS 201
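For the basic problem only (no capacities, no outliers), one classic way to obtain a 2-approximation is farthest-first traversal, commonly attributed to Gonzalez. The sketch below is an illustration of that baseline, not the 25-approximation from the abstract. It assumes the input is a symmetric matrix `dist` of shortest-path distances (so the triangle inequality holds); the function name and the choice of vertex 0 as the first center are arbitrary.

```python
def farthest_first_centers(dist, k):
    """Greedy 2-approximation for uncapacitated k-center on a metric.

    dist[u][v] is assumed to be the shortest-path distance between u and v.
    Returns the chosen centers and the resulting covering radius.
    """
    n = len(dist)
    centers = [0]                      # arbitrary first center
    closest = list(dist[0])            # distance from each vertex to its nearest center
    while len(centers) < min(k, n):
        # Open the next center at the vertex currently farthest from all centers.
        nxt = max(range(n), key=lambda v: closest[v])
        centers.append(nxt)
        closest = [min(closest[v], dist[nxt][v]) for v in range(n)]
    return centers, max(closest)


# Four points on a line at coordinates 0, 1, 10, 11.
coords = [0, 1, 10, 11]
dist = [[abs(a - b) for b in coords] for a in coords]
print(farthest_first_centers(dist, 2))
```

Capacities and outliers break this greedy argument, since the nearest center may be full and the farthest vertex may be one that should be discarded.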

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Approximating Upper Degree-Constrained Partial Orientations

Authors: Cygan Marek, Kociumaka Tomasz
Publication date: 10/10/2014

In the Upper Degree-Constrained Partial Orientation problem we are given an undirected graph $G=(V,E)$, together with two degree constraint functions $d^-,d^+ : V \to \mathbb{N}$. The goal is to orient as many edges as possible, in such a way that for each vertex $v \in V$ the number of arcs entering $v$ is at most $d^-(v)$, whereas the number of arcs leaving $v$ is at most $d^+(v)$. This problem was introduced by Gabow [SODA'06], who proved it to be MAXSNP-hard (and thus APX-hard). In the same paper Gabow presented an LP-based iterative rounding $4/3$-approximation algorithm. Since the problem in question is a special case of the classic 3-Dimensional Matching, which in turn is a special case of the $k$-Set Packing problem, it is reasonable to ask whether recent improvements in approximation algorithms for the latter two problems [Cygan, FOCS'13; Sviridenko & Ward, ICALP'13] allow for an improved approximation for Upper Degree-Constrained Partial Orientation. We follow this line of reasoning and present a polynomial-time local search algorithm with approximation ratio $5/4+\varepsilon$. Our algorithm uses a combination of two types of rules: improving sets of bounded pathwidth from the recent $4/3+\varepsilon$-approximation algorithm for 3-Set Packing [Cygan, FOCS'13], and a simple rule tailor-made for the setting of partial orientations. In particular, we exploit the fact that one can check in polynomial time whether it is possible to orient all the edges of a given graph [Gyárfás & Frank, Combinatorics'76].

Comment: 12 pages, 1 figur
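The problem statement can be pinned down with a small exhaustive solver. This is only a sketch for checking tiny instances by hand: it tries all $3^{|E|}$ choices (leave an edge unoriented, or orient it either way) and is unrelated to the local search algorithm from the abstract. The function name and the dictionary-based representation of $d^-$ and $d^+$ are assumptions.

```python
from itertools import product


def max_partial_orientation(edges, d_in, d_out):
    """Exhaustively find the maximum number of edges that can be oriented.

    edges: list of (u, v) pairs; d_in[v] and d_out[v] bound the in-degree
    and out-degree of v. Exponential time, for tiny instances only.
    """
    best = 0
    for choice in product((None, 0, 1), repeat=len(edges)):
        indeg = {v: 0 for v in d_in}
        outdeg = {v: 0 for v in d_out}
        for (u, v), c in zip(edges, choice):
            if c is None:
                continue                      # edge left unoriented
            tail, head = (u, v) if c == 0 else (v, u)
            outdeg[tail] += 1
            indeg[head] += 1
        feasible = all(indeg[v] <= d_in[v] for v in d_in) and \
                   all(outdeg[v] <= d_out[v] for v in d_out)
        if feasible:
            best = max(best, sum(c is not None for c in choice))
    return best


# Star with center 0 and three leaves; every bound equals 1.
star = [(0, 1), (0, 2), (0, 3)]
ones = {v: 1 for v in range(4)}
print(max_partial_orientation(star, ones, ones))
```

In the star example the center can absorb at most one incoming and one outgoing arc, so one edge must stay unoriented.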

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

Faster Longest Common Extension Queries in Strings over General Alphabets

Authors: Gawrychowski Paweł, Kociumaka Tomasz, Rytter Wojciech, Waleń Tomasz
Publication date: 01/01/2016

Longest common extension queries (often called longest common prefix queries) constitute a fundamental building block in multiple string algorithms, for example computing runs and approximate pattern matching. We show that a sequence of $q$ LCE queries for a string of size $n$ over a general ordered alphabet can be realized in $O(q \log \log n+n\log^*n)$ time making only $O(q+n)$ symbol comparisons. Consequently, all runs in a string over a general ordered alphabet can be computed in $O(n \log \log n)$ time making $O(n)$ symbol comparisons. Our results improve upon a solution by Kosolobov (Information Processing Letters, 2016), who gave an algorithm with $O(n \log^{2/3} n)$ running time and conjectured that $O(n)$ time is possible. We make significant progress towards resolving this conjecture. Our techniques extend to the case of general unordered alphabets, when the time increases to $O(q\log n + n\log^*n)$. The main tools are difference covers and the disjoint-sets data structure.

Comment: Accepted to CPM 201
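For reference, an LCE query can be stated as a direct scan. The sketch below is the naive baseline, which may spend up to $n$ symbol comparisons on a single query; it does not use difference covers or disjoint sets, and the function name `lce` is an assumption. It touches symbols only through equality tests, which is all that the general unordered alphabet setting permits.

```python
def lce(s, i, j):
    """Length of the longest common prefix of s[i:] and s[j:], by direct scan."""
    n = len(s)
    length = 0
    # Compare symbols pairwise until a mismatch or the end of the string.
    while i + length < n and j + length < n and s[i + length] == s[j + length]:
        length += 1
    return length


print(lce("abcabcab", 0, 3))
print(lce("banana", 1, 3))
```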

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Internal Pattern Matching Queries in a Text and Applications

Authors: Kociumaka Tomasz, Radoszewski Jakub, Rytter Wojciech, Waleń Tomasz
Publication date: 13/10/2014

We consider several types of internal queries: questions about subwords of a text. As the main tool we develop an optimal data structure for the problem called here internal pattern matching. This data structure provides constant-time answers to queries about occurrences of one subword $x$ in another subword $y$ of a given text, assuming that $|y|=\mathcal{O}(|x|)$, which allows for a constant-space representation of all occurrences. This problem can be viewed as a natural extension of the well-studied pattern matching problem. The data structure has linear size and admits a linear-time construction algorithm. Using the solution to the internal pattern matching problem, we obtain very efficient data structures answering queries about: primitivity of subwords, periods of subwords, general substring compression, and cyclic equivalence of two subwords. All these results improve upon the best previously known counterparts. The linear construction time of our data structure also allows us to improve the algorithm for finding $\delta$-subrepetitions in a text (a more general version of maximal repetitions, also called runs). For any fixed $\delta$ we obtain the first linear-time algorithm, which matches the linear time complexity of the algorithm computing runs. Our data structure has already been used as part of efficient solutions for subword suffix rank & selection, as well as substring compression using the Burrows-Wheeler transform composed with run-length encoding.

Comment: 31 pages, 9 figures; accepted to SODA 201
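The constant-space representation mentioned above comes from a standard periodicity fact: when $|y| < 2|x|$, the starting positions of $x$ inside $y$ form a single arithmetic progression, so three numbers describe them all. The sketch below finds the occurrences by plain scanning (not in constant time) and then packs them into such a triple. The function names and the half-open `(start, end)` ranges are assumptions for illustration.

```python
def internal_occurrences(text, x_range, y_range):
    """Start positions (in text) of x = text[xs:xe] lying fully inside y = text[ys:ye]."""
    xs, xe = x_range
    ys, ye = y_range
    x = text[xs:xe]
    positions = []
    pos = text.find(x, ys, ye)         # search restricted to text[ys:ye]
    while pos != -1:
        positions.append(pos)
        pos = text.find(x, pos + 1, ye)
    return positions


def as_progression(positions):
    """Pack positions into (first, step, count); valid when |y| < 2|x|."""
    if not positions:
        return None
    if len(positions) == 1:
        return (positions[0], 0, 1)
    step = positions[1] - positions[0]
    # Under the length assumption, consecutive gaps are all equal.
    assert all(b - a == step for a, b in zip(positions, positions[1:]))
    return (positions[0], step, len(positions))


text = "abababab"
occ = internal_occurrences(text, (0, 4), (1, 8))
print(occ, as_progression(occ))
```

For longer $y$ with $|y| = \mathcal{O}(|x|)$, the same idea applies after covering $y$ with a constant number of windows of length below $2|x|$.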

arXiv.org e-Print Archive

Crossref